Binarization of Document Image

Author

  • Kumar Reddy
Abstract

Document image binarization is performed in the preprocessing stage of document analysis and aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for subsequent document image processing tasks such as optical character recognition (OCR). Although document image binarization has been studied for many years, the thresholding of degraded document images remains an unsolved problem because of the high inter- and intra-image variation between the text stroke and the document background across different document images. Handwritten text in degraded documents often varies in stroke width, stroke brightness, stroke connection, and document background. In addition, historical documents are often degraded by bleed-through, and documents in general are often degraded by different types of imaging artifacts. These degradations tend to induce thresholding errors and make degraded document image binarization a major challenge for most state-of-the-art techniques. The proposed method is simple, robust, and capable of handling different types of degraded document images with minimal parameter tuning. It uses an adaptive image contrast that combines the local image contrast and the local image gradient adaptively, and is therefore tolerant of the text and background variation caused by different types of document degradation. In particular, the proposed technique addresses the over-normalization problem of the local maximum-minimum algorithm. At the same time, the parameters used in the algorithm can be estimated adaptively.
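The abstract does not give the formula for the adaptive image contrast, so the following is only a rough sketch of how a local contrast term and a local gradient term could be blended. The function name `adaptive_contrast`, the window size, the `gamma` exponent, the scaling of the gradient to [0, 1], and the weighting `alpha = (std / 128) ** gamma` are all assumptions made for illustration, not details confirmed by this page.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def adaptive_contrast(gray, window=3, gamma=1.0, eps=1e-6):
    """Blend local contrast and local gradient (illustrative sketch only).

    gray   : 2-D array of intensities in [0, 255].
    window : odd side length of the square neighbourhood (assumed value).
    gamma  : exponent controlling the blend weight (assumed parameter).
    """
    img = np.asarray(gray, dtype=np.float64)
    pad = window // 2
    # Replicate border pixels so the output has the same shape as the input.
    padded = np.pad(img, pad, mode="edge")
    patches = sliding_window_view(padded, (window, window))
    local_max = patches.max(axis=(-2, -1))
    local_min = patches.min(axis=(-2, -1))

    # Local gradient: max-min difference, scaled to [0, 1] (assumed scaling).
    gradient = (local_max - local_min) / 255.0
    # Local contrast: the same difference normalized by local brightness.
    contrast = (local_max - local_min) / (local_max + local_min + eps)

    # Assumed weighting: images with high global variation lean more on
    # the normalized contrast, low-variation images more on the gradient.
    alpha = float(np.clip((img.std() / 128.0) ** gamma, 0.0, 1.0))
    return alpha * contrast + (1.0 - alpha) * gradient


if __name__ == "__main__":
    page = np.full((5, 5), 200, dtype=np.uint8)
    page[:, 2:] = 50  # a dark vertical "stroke" region
    print(adaptive_contrast(page).round(3))
```

The map is zero in flat regions and positive near stroke boundaries. A complete binarization method would still need a step that turns this map into a threshold decision, and the abstract does not describe that step.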

Similar resources

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...

Ancient Document Images Enhancement Using Phase Based Binarization

In this paper, we present a phase-based binarization model for degraded document images, along with a post-processing method that can improve any binarization method and a ground truth generation tool. Many binarization techniques have been implemented in the literature for different types of binarization problems. These include an adaptive image contrast based document image binarization technique t...

Document Image Binarization Using Threshold Segmentation

Binarization is the process of generating a binary image from a document image. Document image binarization has been under research for many years, and many binarization algorithms have been proposed for different types of degraded document images. Document image binarization is widely used to enhance old handwritten and machine-printed documents. Still, recovering a degraded document is very tedi...

Foreground-Background Regions Guided Binarization of Camera-Captured Document Images

Binarization is an important preprocessing step in several document image processing tasks. Handheld camera devices are now in widespread use and allow fast and flexible document image capture. However, they may produce degraded grayscale images, especially because of bad shading or non-uniform illumination. State-of-the-art binarization techniques, which are designed for scanned images, do not...

A Survey on Degraded Document Image Binarization Techniques

Segmentation is the major technique used in image binarization to separate pixel values into two groups, black as foreground and white as background. Degraded document images are segmented using image binarization in order to obtain clear images that match the original documents. The thresholding process is t...

A Combination of Laplacian Energy, Global and Adaptive Techniques for Degraded Document Image Binarization

Many document image binarization algorithms have been proposed to improve the binarization of degraded document images. This paper reviews algorithms for document image binarization, each of which has advantages and disadvantages. To remove these drawbacks, this paper proposes a combined approach that first combines different types of global and local t...

Publication date: 2016